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NETWORK ENDPOINT SYSTEM WITH ACCELERATED DATA PATH 

■ * 

5 

BACKGROUND OF THE INVENTION 

The present invention relates generally to computing systems, and more particularly to 
network connected computing systems. 

10 

A wide variety of computing systems may be connected to computer networks. These 
systems may include network endpoint systems and network intermediate node systems. 
Servers and clients are typical examples of network endpoint systems and network switches are 
typical examples of network intermediate node computing systems. Many other types of 
1 5 network endpoint systems and network intermediate node systems also exist. 

■ 

As the desire for increased network bandwidth, speed and general performance is always 
present, many different techniques have been utilized to achieve such goals. A common 
technique to address network compute / performance bottlenecks is to simply provide more 

20 network connected resources. For example, in a server environment a compute / performance 
bottleneck is often dealt with by merely adding more servers which operate in parallel. Thus a 
"rack and stack" or "server farm" solution to a network performance problem has developed. 
Merely adding more servers to solve a bottleneck is often wasteful of resources as the bottleneck 
may be related to just one particular type of resource, yet the addition of a complete server, or 

25 sets of servers, provides many types of additional resources. For example, a bottleneck may 
relate to application processor resources yet the addition of more servers also provides additional 
memory resources, network interface resources, etc., all of which may not be required, or under 
utilized. 

* 

30 Alternatively, more elegant attempts to improve network computing performance have 

been made. However, these more elegant solutions generally employ proprietary operating 
systems which have been optimized for specific tasks thereby making these more elegant 
solutions expensive and significantly limited by their underlying system architecture. 
Furthermore, when a wide variety of techniques and operating systems are utilized, the users 

35 implementation and maintenance of such systems becomes a harder task. Also the limited 
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flexibility and versatility of such equipment makes it a very restrictive investment. 

The conventional server architecture uses a PC type computer with a general purpose 
microprocessor (CPU) to perform the server tasks. A network controller provides the network 
5 interface, and the CPU is programmed with server software. All processing tasks are performed 
by the CPU, including the transfer of data from disk to the network controller. This server 
architecture is flexible and permit the server to perform the many tasks involved in servicing 
client requests. However, it is not necessarily the most efficient design. 

10 To improve efficiency, "server appliances" have been developed that use the same basic 

hardware as conventional servers but that whose software is specialized. For example, the 
appliance may be specifically designed to deliver a specific type of content, such as HTTP 
content. However, these appliance type devices tend to be more difficult to upgrade as 
compared to conventional general purpose servers. 

15 

Another approach to improved server performance is the use of "intelligent I/O". This 
approach provides a mechanism for transferring data directly from the disk controller to the 
network controller. This requires adding a processor to the disk controller to perform memory 
access and data transfer tasks. The server CPU sends requests to the disk controller, which 
20 formats the data for delivery to the network controller. However, the disk controller processor 
tends to lack the processing power necessary to effectively unburden the main server CPU. 

Thus, the traditional network computing systems do not provide efficient and scalable 
computing solutions. It is desirable to provide a network computing system that allows for 
25 system scalability and efficient use of system resources, all in a cost effective system that 
operates in a user friendly manner. 

SUMMARY OF THE INVENTION 

Disclosed herein are systems and methods for network connected computing systems 
30 that employ functional multi-processing to optimize bandwidth utilization and accelerate system 
performance. In one embodiment, the network connected computing system may include a 

■ ■ 

switch based computing system. The system may further include an asymmetric multi-processor 
system configured in a staged pipeline manner. The network connected computing system may 
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be utilized in one embodiment as a network endpoint system that provides content delivery. 

The disclosed systems may employ individual modular processing engines that are 
optimized for different layers of a software stack. Each individual processing engine may be 
5 provided with one or more discrete subsystem modules configured to run on their own 
optimized platform and/or to function in parallel with one or more other subsystem modules. A 
high speed distributive interconnect, such as a switch fabric, allows peer-to-peer communication 
between individual subsystem modules. The use of discrete subsystem modules that are 
distributively interconnected in this manner advantageously allows individual resources (e.g., 
10 processing resources, memory resources) to be deployed by sharing or reassignment in order to 
maximize system performance. Furthermore, policy enhancement/enforcement may be 
optimized by placing intelligence in each individual modular processing engine and/or providing 
a system management processing engine. 

1 5 The system is configured to be flexible and scalable. Resources may be reallocated on 

an as need basis for various tasks. Utilizing a multiple processor module or blade arrangement 
connected through a distributive interconnect (for example a switch fabric) provides a system 
that is easily scalable by allowing the installation of additional subsystem modules without 
significant degradation of system performance. 

< 

20 

The network connected computing system may include a processing engine that utilizes 
network processors. Network processors may be used even though the system may be operated 
as a network endpoint system. The network processors may offload tasks from the other 
processors. The network processor may also be utilized to intelligently direct data traffic to the 
25 various other system processors based on configurable policies, system states, etc. 

The system configuration provides for accelerated processing and data / communication 
traffic through the system. One acceleration technique involves the use of a multi-engine system 
with dedicated engines for varying processor tasks. Each engine can perform operations 
30 independently and in parallel with the other engines without the other engines needing to freeze 
or halt operations or share resources. Acceleration is also obtained from the use of multiple 
processor modules within a processing engine. In this manner, parallelism may be achieved 
within a specific processing engine. Thus, multiple processors responding to different requests 



WO 02/39263 PCT/US01/46091 

4 

may be operating in parallel within one engine. Acceleration is also provided by utilizing the 
multi-engine design in a peer to peer environment in which each engine may communicate as a 
peer. Thus, the communications and data paths may skip unnecessary engines. 

5 Acceleration is also achieved by removing or stripping the contents of some protocol 

layers in one processing engine and replacing those layers with identifiers or tags that are 
directly usable by the next processor engine in the data or communications flow path. Network 
processors may be utilized for such processing of the protocol layers. In addition, acceleration 
has been provided through the use of a distributed interconnection such as a switch fabric. A 
10 switch fabric allows for parallel communications between the various engines and helps to 
efficiently implement some of the acceleration techniques described herein. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 A is a representation of components of a content delivery system according to one 
1 5 embodiment of the disclosed content delivery system. 

FIG. IB is a representation of data flow between modules of a content delivery system of 
FIGURE 1 A according to one embodiment of the disclosed content delivery system. 

20 FIG. 1C is a simplified schematic diagram showing one possible network content 

delivery system hardware configuration. • 

FIG. ID is a simplified schematic diagram showing a network content delivery engine 
configuration possible with the network content delivery system hardware configuration of FIG. 
25 1C. 

FIG. IE is a simplified schematic diagram showing an alternate network content delivery 
engine configuration possible with the network content delivery system hardware configuration 
of FIG. 1C. 

30 

FIG. IF is a simplified schematic diagram showing another alternate network content 

* 

delivery engine configuration possible with the network content delivery system hardware 
configuration of FIG. 1C. 
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FIGS. 1G-1 J illustrate exemplary clusters of network content delivery systems. 

FIG. 2 is a simplified schematic diagram showing another possible network; content 
5 delivery system configuration. 

FIG. 2A is a simplified schematic diagram showing a network endpoint computing 

system. 

10 FIG. 2B is a simplified schematic diagram showing a network endpoint computing 

system. 

FIG. 3 is a functional block diagram of an exemplary network processor. 

1 5 FIG. 4 is a functional block diagram of an exemplary interface between a switch fabric 

and a processor. 

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 

Disclosed herein are systems and methods for operating network connected computing 
20 systems. The network connected computing systems disclosed provide a more efficient use of 
computing system resources and provide improved performance as compared to traditional 
network connected computing systems. Network connected computing systems may include 
network endpoint systems. The systems and methods disclosed herein may be particularly 
beneficial for use in network endpoint systems. Network endpoint systems may include a wide 
25 variety of computing devices, including but not limited to, classic general purpose servers, 
specialized servers, network appliances, storage area networks or other storage medium, content 
delivery systems, corporate data centers, application service providers, home or laptop 
computers, clients, any other device that operates as an endpoint network connection, etc. 

■ 

30 Other network connected systems may be considered as a network intermediate node 

system. Such systems are generally connected to some node of a network that may operate in 
some other fashion than an endpoint. Typical examples include network switches or network 
routers. Network intermediate node systems may also include any other devices coupled to 
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intermediate nodes of a network. 

> 

Further, some devices may be considered both a network intermediate node system and a 
network endpoint system. Such hybrid systems may perform both endpoint functionality and 
5 intermediate node functionality in the same device. For example, a network switch that also 
performs some endpoint functionality may be considered a hybrid system. As used herein such 
hybrid devices are considered to be a network endpoint system and are also considered to be a 
network intermediate node system. 

10 For ease of understanding, the systems and methods disclosed herein are described with 

regards to an illustrative network connected computing system. In the illustrative example the 
system is a network endpoint system optimized for a content delivery application. Thus a 
content delivery system is provided as an illustrative example that demonstrates the structures, 
methods, advantages and benefits of the network computing system and methods disclosed 

1 5 herein. Content delivery systems (such as systems for serving streaming content, HTTP content, 
cached content, etc.) generally have intensive input / output demands. 

It will be recognized that the hardware and methods discussed below may be 
incorporated into other hardware or applied to other applications. For example with respect to 

20 hardware, the disclosed system and methods may be utilized in network switches. Such 
switches may be considered to be intelligent or smart switches with expanded functionality 
beyond a traditional switch. Referring to the content delivery application described in more 
detail herein, a network switch may be configured to also deliver at least some content in 
addition to traditional switching functionality. Thus, though the system may be considered 

25 primarily a network switch (or some other network intermediate node device), the system may 
incorporate the hardware and methods disclosed herein. Likewise a network switch performing 
applications other than content delivery may utilize the systems and methods disclosed herein. 
The nomenclature used for devices utilizing the concepts of the present invention may vary. The 
network switch or router that includes the content delivery system disclosed herein may be 

30 called a network content switch or a network content router or the like. Independent of the 
nomenclature assigned to a device, it will be recognized that the network device may incorporate 

■ ■ 

some or all of the concepts disclosed herein. 
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The disclosed hardware and methods also may be utilized in storage area networks, 
network attached storage, channel attached storage systems, disk arrays, tape storage systems, 
direct storage devices or other storage systems. In this case, a storage system having the 
traditional storage system functionality may also include additional functionality utilizing the 

5 hardware and methods shown herein. Thus, although the system may primarily be considered a 
storage system, the system may still include the hardware and methods disclosed herein. The 
disclosed hardware and methods of the present invention also may be utilized in traditional 
personal computers, portable computers, servers, workstations, mainframe computer systems, or 
other computer systems. In this case, a computer system having the traditional computer system 

10 functionality associated with the particular type of computer system may also include additional 
functionality utilizing the hardware and methods shown herein. Thus, although the system may 
primarily be considered to be a particular type of computer system, the system may still include 
the hardware and methods disclosed herein. 



1 5 As mentioned above, the benefits of the present invention are not limited to any specific 

tasks or applications. The content delivery applications described herein are thus illustrative 
only. Other tasks and applications that may incorporate the principles of the present invention 
include, but are not limited to, database management systems, application service providers, 
corporate data centers, modeling and simulation systems, graphics rendering systems, other 

20 complex computational analysis systems, etc. Although the principles of the present invention 
may be described with respect to a specific application, it will be recognized that many other 

■ 

tasks or applications performed with the hardware and methods. 

* 

Disclosed herein are systems and methods for delivery of content to computer-based 
25 networks that employ functional multi-processing using a "staged pipeline" content delivery 
environment to optimize bandwidth utilization and accelerate content delivery while allowing 
greater determination in the data traffic management. The disclosed systems may employ 
individual modular processing engines that are optimized for different layers of a software stack. 
Each individual processing engine may be provided with one or more discrete subsystem 
30 modules configured to run on their own optimized platform and/or to function in parallel with 
one or more other subsystem modules across a high speed distributive interconnect, such as a 
switch fabric, that allows peer-to-peer communication between individual subsystem modules. 

» 

The use of discrete subsystem modules that are distributively interconnected in this manner 



■ 
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advantageously allows individual resources (e.g., processing resources, memory resources) to be 
deployed by sharing or reassignment in order to maximize acceleration of content delivery by 
the content delivery system. The use of a scalable packet-based interconnect, such as a switch 
fabric, advantageously allows the installation of additional subsystem modules without 
5 significant degradation of system performance. Furthermore, policy enhancement/enforcement 
may be optimized by placing intelligence in each individual modular processing engine. 

The network systems disclosed herein may operate as network endpoint systems. 
Examples of network endpoints include, but are not limited to, servers, content delivery systems, 
10 storage systems, application service providers, database management systems, corporate data 
center servers, etc. A client system is also a network endpoint, and its resources may typically 
range from those of a general purpose computer to the simpler resources of a network appliance. 
The various processing units of the network endpoint system may be programmed to achieve the 
desired type of endpoint. 

15 

Some embodiments of the network endpoint systems disclosed herein are network 
endpoint content delivery systems. The network endpoint content delivery systems may be 
utilized in replacement of or in conjunction with traditional network servers. A "server" can be 
any device that delivers content, services, or both. For example, a content delivery server 
20 receives requests for content from remote browser clients via the network, accesses a file system 
to retrieve the requested content, and delivers the content to the client. As another example, an 
applications server may be programmed to execute applications software on behalf of a remote . 
client, thereby creating data for use by the client Various server appliances are being developed 
and often perform specialized tasks. 

25 

As will be described more fully below, the network endpoint system disclosed herein 
may include the use of network processors. Though network processors conventionally are 
designed and utilized at intermediate network nodes, the network endpoint system disclosed 
herein adapts this type of processor for endpoint use. 

30 

« 

The network endpoint system disclosed may be construed as a switch based computing 
system. The system may further be characterized as an asymmetric multi-processor system 
configured in a staged pipeline maimer. 
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EXEMPLARY SYSTEM OVERVIEW 

FIG. 1 A is a representation of one embodiment of a content delivery system 1010, for 
example as may be employed as a network endpoint system in connection with a network 1020. 
5 Network 1020 may be any type of computer network suitable for linking computing systems. 
Content delivery system 1010 may be coupled to one or more networks including, but not 
limited to, the public internet, a private intranet network (e.g., linking users and hosts such as 
employees of a corporation or institution), a wide area network (WAN), a local area network 
(LAN), a wireless network, any other client based network or any other network environment of 

10 connected computer systems or online users. Thus, the data provided from the network 1020 
may be in any networking protocol. In one embodiment, network 1020 may be the public 
internet that serves to provide access to content delivery system 1010 by multiple online users 
that utilize internet web browsers on personal computers operating through an internet service 
provider. In this case the data is assumed to follow one or more of various Internet Protocols, 

15 such as TCP/IP, UDP, HTTP, RTSP, SSL, FTP, etc. However, the same concepts apply to 
networks using other existing or future protocols, such as IPX, SNMP, NetBios, Ipv6, etc. The 
concepts may also apply to file protocols such as network file system (NFS) or common internet 
file system (CEFS) file sharing protocol. 

20 Examples of content that may be delivered by content delivery system 1010 include, but 

are not limited to, static content (e.g., web pages, MP3 files, HTTP object files, audio stream 
files, video stream files, etc.), dynamic content, etc. In this regard, static content may be defined 
as content available to content delivery system 1010 via attached storage devices and as content 
that does not generally require any processing before delivery. Dynamic content, on the other 

25 hand, may be defined as content that either requires processing before delivery, or resides 
remotely from content delivery system 1010. As illustrated in FIG. 1A, content sources may 
include, but are not limited to, one or more storage devices 1 090 (magnetic disks, optical disks, 
tapes, storage area networks (SAN's), etc), other content sources 1100, third party remote 
content feeds, broadcast sources (live direct audio or video broadcast feeds, etc.), delivery of 

30 cached content, combinations thereof, etc. Broadcast or remote content may be advantageously 
received through second network connection 1 023 and delivered to network 1 020 via an 
accelerated flowpath through content delivery system 1010. As discussed below, second 
network connection 1023 may be connected to a second network 1024 (as shown). 
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Alternatively, both network connections 1022 and 1023 may be connected to network 1020. 

As shown in FIG. 1A, one embodiment of content delivery system 1010 includes 
multiple system engines 1030, 1040, 1050, 1060, and 1070 communicatively coupled via 

* 

5 distributive interconnection 1080. In the exemplary embodiment provided, these system engines 
operate as content delivery engines. As used herein, "content delivery engine" generally 
includes any hardware, software or hardware/software combination capable of performing one 
or more dedicated tasks or sub-tasks associated with the delivery or transmittal of content from 
one or more content sources to one or more networks. In the embodiment illustrated in FIG. 1 A 

10 content delivery processing engines (or "processing blades") include network interface 
processing engine 1030, storage processing engine 1040, network transport / protocol processing 
engine 1050 (referred to hereafter as a transport processing engine), system management 
processing engine 1060, and application processing engine 1070. Thus configured, content 
delivery system 1010 is capable of providing multiple dedicated and independent processing 

15 engines that are optimized for networking, storage and application protocols, each of which is 
substantially self-contained and therefore capable of functioning without consuming resources 
of the remaining processing engines. 

It will be understood with benefit of this disclosure that the particular number and 
20 identity of content delivery engines illustrated in FIG. 1A are illustrative only, and that for any 
given content delivery system 1010 the number and/or identity of content delivery engines may 
be varied to fit particular needs of a given application or installation. Thus, the number of 
engines employed in a given content delivery system may be greater or fewer in number than 
illustrated in FIG. 1A, and/or the selected engines may include other types of content delivery 
25 engines and/or may not include all of the engine types illustrated in FIG. 1A. In one 
embodiment, the content delivery system 1010 may be implemented within a single chassis, 
such as for example, a 2U chassis. 

* 

i 

Content delivery engines 1030, 1040, 1050, 1060 and 1070 are present to independently 

* 

30 perform selected sub-tasks associated with content delivery from content sources 1090 and/or 
1 100, it being understood however that in other embodiments any one or more of such subtasks 
may be combined and performed by a single engine, or subdivided to be performed by more 
than one engine. In one embodiment, each of engines 1030, 1040, 1050, 1060 and 1070 may 
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employ one or more independent processor modules (e.g., CPU modules) having independent 
processor and memory subsystems and suitable for performance of a given function/s, allowing 
independent operation without interference from other engines or modules. Advantageously, 
this allows custom selection of particular processor-types based on the particular sub-task each 
5 is to perform, and in consideration of factors such as speed or efficiency in performance of a 
given subtask, cost of individual processor, etc. The processors utilized may be any processor 
suitable for adapting to endpoint processing. Any "PC on a board" type device may be used, 
such as the x86 and Pentium processors from Intel Corporation, the SPARC processor from Sun 
Microsystems, Inc., the PowerPC processor from Motorola, Inc. or any other microcontroller or 
10 microprocessor. In addition, network processors (discussed in more detail below) may also be 
utilized. The modular multi-task configuration of content delivery system 1010 allows the 
number and/or type of content delivery engines and processors to be selected or varied to fit the 
needs of a particular application. 

15 The configuration of the content delivery system described above provides scalability 

without having to scale all the resources of a system. Thus, unlike the traditional rack and stack 
systems, such as server systems in which an entire server may be added just to expand one 

» 

segment of system resources, the content delivery system allows the particular resources needed 
to be the only expanded resources. For example, storage resources may be greatly expanded 
20 without having to expand all of the traditional server resources. 

DISTRIBUTIVE INTERCONNECT 

■ 

Still referring to FIG. 1A, distributive interconnection 1080 may be any multi-node I/O 
interconnection hardware or hardware/software system suitable for distributing functionality by 

25 selectively interconnecting two or more content delivery engines of a content delivery system 
including, but not limited to, high speed interchange systems such as a switch fabric or bus 
architecture. Examples of switch fabric architectures include cross-bar switch fabrics, Ethernet 
switch fabrics, ATM switch fabrics, etc. Examples of bus architectures include PCI, PCI-X, S- 
Bus, MicroChannel, VME, etc. Generally, for purposes of this description, a "bus" is any system 

30 bus that carries data in a manner that is visible to all nodes on the bus. Generally, some sort of 
bus arbitration scheme is implemented and data may be carried in parallel, as n-bit words. As 
distinguished from a bus, a switch fabric establishes independent paths from node to node and 
data is specifically addressed to a particular node on the switch fabric. Other nodes do not see 
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the data nor are they blocked from creating their own paths. The result is a simultaneous 
guaranteed bit rate in each direction for each of the switch fabric's ports. 

The use of a distributed interconnect 1080 to connect the various processing engines in 
5 lieu of the network connections used with the switches of conventional multi-server endpoints is 
beneficial for several reasons. As compared to network connections, the distributed interconnect 
1080 is less error prone, allows more deterministic content delivery, and provides higher 
bandwidth connections to the various processing engines. The distributed interconnect 1080 also 
has greatly improved data integrity and throughput rates as compared to network connections. 

« 

10 

Use of the distributed interconnect 1080 allows latency between content delivery engines 

* 

to be short, finite and follow a known path. Known maximum latency specifications are 
typically associated with the various bus architectures listed above. Thus, when the employed 
interconnect medium is a bus, latencies fall within a known range. In the case of a switch fabric, 
15 latencies are fixed. Further, the connections are "direct", rather than by some undetermined 
path. In general, the use of the distributed interconnect 1080 rather than network connections, 
permits the switching and interconnect capacities of the content delivery system 1010 to be 
predictable and consistent. 

20 One example interconnection system suitable for use as distributive interconnection 1080 

is an 8/16 port 28.4 Gbps high speed PRIZMA-E non-blocking switch fabric switch available 
from IBM. It will be understood that other switch fabric configurations having greater or lesser 
numbers of ports, throughput, and capacity are also possible. Among the advantages offered by 
such a switch fabric interconnection in comparison to shared-bus interface interconnection 

25 technology are throughput, scalability and fast and efficient communication between individual 
discrete content delivery engines of content delivery system 1010. In the embodiment of FIG. 
1A, distributive interconnection 1080 facilitates parallel and independent operation of each 
engine in its own optimized environment without bandwidth interference from other engines, 
while at the y same time providing peer-to-peer communication between the engines on an as- 

30 needed basis allowing direct communication between any two content delivery engines 
1030, 1040, 1050, 1060 and 1070). Moreover, the distributed interconnect may directly transfer 
inter-processor communications between the various engines of the system. Thus, 
communication, command and control information may be provided between the various peers 
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* 

via the distributed interconnect. In addition, communication from one peer to multiple peers 
may be implemented through a broadcast communication which is provided from one peer to all 

■ 

peers coupled to the interconnect. The interface for each peer may be standardized, thus 
providing ease of design and allowing for system scaling by providing standardized ports for 
5 adding additional peers. 

* 

NETWORK INTERFACE PROCESSING ENGINE 

As illustrated in FIG. 1A, network interface processing engine 1030 interfaces with 
network 1020 by receiving and processing requests for content and delivering requested content 

10 to network 1020. Network interface processing engine 1030 may be any hardware or 
hardware/software subsystem suitable for connections utilizing TCP (Transmission Control 
Protocol) DP (Internet Protocol), UDP (User Datagram Protocol), RTP (Real-Time Transport 
Protocol), Internet Protocol (IP), Wireless Application Protocol (WAP) as well as other 
networking protocols. Thus the network interface processing engine 1030 may be suitable for 

1 5 handling queue management, buffer management, TCP connect sequence, checksum, IP address 
lookup, internal load balancing, packet switching, etc. Thus, network interface processing 
engine 1030 may be employed as illustrated to process or terminate one or more layers of the 
network protocol stack and to perform look-up intensive operations, offloading these tasks from 
other content delivery processing engines of content delivery system 1010. Network interface 

20 processing engine 1030 may also be employed to load balance among other content delivery 
processing engines of content delivery system 1010. Both of these features serve to accelerate 
content delivery, and are enhanced by placement of distributive interchange and protocol 
termination processing functions on the same board. Examples of other functions that may be 
performed by network interface processing engine 1030 include, but are not limited to, security 

25 processing. 

With regard to the network protocol stack, the stack in traditional systems may often be 
rather large. Processing the entire stack for every request across the distributed interconnect 
may significantly impact performance. As described herein, the protocol stack has been 

* 

30 segmented or "split" between the network interface engine and the transport processing engine. 
An abbreviated version of the protocol stack is then provided across the interconnect. By 
utilizing this functionally split version of the protocol stack, increased bandwidth may be 
obtained. In this manner the communication and data flow through the content delivery system 
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1010 may be accelerated. The use of a distributed interconnect (for example a switch fabric) 
farther enhances this acceleration as compared to traditional bus interconnects. 



The network interface processing engine 1030 may be coupled to the network 1020 
5 through a Gigabit (Gb) Ethernet fiber front end interface 1022. One or more additional Gb 
Ethernet interfaces 1023 may optionally be provided, for example, to form a second interface 
with network 1020, or to form an interface with a second network or application 1024 as shown 
to form an interface with one or more server/s for delivery of web cache content, etc.). 
Regardless of whether the network connection is via Ethernet, or some other means, the network 
10 connection could be of any type, with other examples being ATM, SONET, or wireless. The 
physical medium between the network and the network processor may be copper, optical fiber, 
wireless, etc. 



In one embodiment, network interface processing engine 1030 may utilize a network 
1 5 processor, although it will be understood that in other embodiments a network processor may be 
supplemented with or replaced by a general purpose processor or an embedded microcontroller. 
The network processor may be one of the various types of specialized processors that have been 
designed and marketed to switch network traffic at intermediate nodes. Consistent with this 
conventional application, these processors are designed to process high speed streams of 
20 network packets. In conventional operation, a network processor receives a packet from a port, 
verifies fields in the packet header, and decides on an outgoing port to which it forwards the 
packet. The processing of a network processor may be considered as "pass through" processing, 
as compared to the intensive state modification processing performed by general purpose 
processors. A typical network processor has a number of processing elements, some operating in 
25 parallel and some in pipeline. Often a characteristic of a network processor is that it may hide 
memory access latency needed to perform lookups and modifications of packet header fields. A 
network processor may also have one or more network interface controllers, such as a gigabit 
Ethernet controller, and are generally capable of handling data rates at "wire speeds". 

30 Examples of network processors include the C-Port processor manufactured by 

Motorola, Inc., the D0P1200 processor manufactured by Intel Corporation, the Prism processor 
manufactured by SiTera Inc., and others manufactured by MMC Networks, Inc. and Agere, Inc. 
These processors are programmable, usually with a RISC or augmented RISC instruction set, 
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and are typically fabricated on a single chip. 



The processing cores of a network processor are typically accompanied by special 
purpose cores that perform specific tasks, such as fabric interfacing, table lookup, queue 
5 management, and buffer management. Network processors typically have their memory 
management optimized for data movement, and have multiple I/O and memory buses. The 
programming capability of network processors permit them to be programmed for a variety of 
tasks, such as load balancing, network protocol processing, network security policies, and 
QoS/CoS support These tasks can be tasks that would otherwise be performed by another 

10 processor. For example, TCP/IP processing may be performed by a network processor at the 
front end of an endpoint system. Another type of processing that could be offloaded is 
execution of network security policies or protocols. A network processor could also be used for 
load balancing. Network processors used in this manner can be referred to as "network 
accelerators" because their front end "look ahead" processing can vastly increase network 

15 response speeds. Network processors perform look ahead processing by operating at the front 
end of the network endpoint to process network packets in order to reduce the workload placed 
upon the remaining endpoint resources. Various uses of network accelerators are described in 
the following concurrently filed U.S. patent applications: Serial No. 09/797,412, entitled 
"Network Transport Accelerator," by Bailey et. al; Serial No. 09/797,507, entitled "Single 

20 Chassis Network Endpoint System With Network Processor For Load Balancing," by Richter et. 
al; and Serial No. 09/797,411, entitled "Network Security Accelerator," by Canion et. al; the 
disclosures of which are all incorporated herein by reference. When utilizing network 
processors in an endpoint environment it may be advantageous to utilize techniques for order 
serialization of information, such as for example, as disclosed in concurrently filed U.S. patent 

25 application Serial No. 09/797,197, entitled "Methods and Systems For The Order Serialization 
Of Information In A Network Processing Environment," by Richter et. al, the disclosure of 
which is incorporated herein by reference. 



FIG. 3 illustrates one possible general configuration of a network processor. As 
30 illustrated, a set of traffic processors 21 operate in parallel to handle transmission and receipt of 
network traffic. These processors may be general purpose microprocessors or state machines. 
Various core processors 22 - 24 handle special tasks. For example, the core processors 22 - 24 
may handle lookups, checksums, and buffer management. A set of serial data processors 25 
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provide Layer 1 network support. Interface 26 provides the physical interface to the network 
1020. A general purpose bus interface 27 is used for downloading code and configuration tasks. 

« 

A specialized interface 28 may be specially programmed to optimize the path between network 
processor 12 and distributed interconnection 1080. 

5 

As mentioned above, the network processors utilized in the content delivery system 1010 
are utilized for endpoint use, rather than conventional use at intermediate network nodes. In one 
embodiment, network interface processing engine 1030 may utilize a MOTOROLA C-Port C-5 
network processor capable of handling two Gb Ethernet interfaces at wire speed, and optimized 

10 for cell and packet processing. This network processor may contain sixteen 200 MHz MIPS 
processors for cell/packet switching and thirty-two serial processing engines for bit/byte 
processing, checksum generation/verification, etc. Further processing capability may be 
provided by five co-processors that perform the following network specific tasks: 
supervisor/executive, switch fabric interface, optimized table lookup, queue management, and 

15 buffer management. The network processor may be coupled to the network 1020 by using a 
VITESSE GbE SERDES (seriahzer-deserializer) device (for example the VSC7123) and an SFP 
(small form factor pluggable) optical transceiver for LC fiber connection. 

TRANSPORT / PROTOCOL PROCESSING ENGINE 

20 Referring again to FIG. 1A, transport processing engine 1050 may be provided for 

performing network transport protocol sub-tasks, such as processing content requests received 
from network interface engine 1030. Although named a "transport" engine for discussion 
purposes, it will be recognized that the engine 1050 performs transport and protocol processing 
and the term transport processing engine is not meant to limit the functionality of the engine. In 

25 this regard transport processing engine 1050 may be any hardware or hardware/software 
subsystem suitable for TCP/UDP processing, other protocol processing, transport processing, 
etc. In one embodiment transport engine 1050 may be a dedicated TCP/UDP processing module 

■ 

based on an INTEL PENTIUM m or MOTOROLA POWERPC 7450 based processor running 
the Thread-X RTOS environment with protocol stack based on TCP/IP technology. 

30 

As compared to traditional server type computing systems, the transport processing 

* 

engine 1050 may off-load other tasks that traditionally a main CPU may perform. For example, 
the performance of server CPUs significantly decreases when a large amount of network 
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connections are made merely because the server CPU regularly checks each connection for time 
outs. The transport processing engine 1050 may perform time out checks for each network 
connection, session management, data reordering and retransmission, data queueing and flow 

« 

control, packet header generation, etc. off-loading these tasks from the application processing 
5 engine or the network interface processing engine. The transport processing engine 1 050 may 
also handle error checking, likewise freeing up the resources of other processing engines. 

■ 

NETWORK INTERFACE / TRANSPORT SPLIT PROTOCOL 

The embodiment of FIG. 1A contemplates that the protocol processing is shared between 

10 the transport processing engine 1050 and the network interface engine 1030. This sharing 
technique may be called "split protocol stack" processing. The division of tasks may be such 
that higher tasks in the protocol stack are assigned to the transport processor engine. For 
example, network interface engine 1030 may processes all or some of the TCP/IP protocol stack 
as well as all protocols lower on the network protocol stack. Another approach could be to 

1 5 assign state modification intensive tasks to the transport processing engine. 

In one embodiment related to a content delivery system that receives packets, the 
network interface engine performs the MAC header identification and verification, IP header 
identification and verification, IP header checksum validation, TCP and UDP header 

20 identification and validation, and TCP or UDP checksum validation. It also may perform the 
lookup to determine the TCP connection or UDP socket (protocol session identifier) to which a 
received packet belongs. Thus, the network interface engine verifies packet lengths, checksums, , 
and validity. For transmission of packets, the network interface engine performs TCP or UDP 
checksum generation, IP header generation, and MAC header generation,. IP checksum 

25 generation, MAC FCS/CRC generation, etc. 

Tasks such as those described above can all be performed rapidly by the parallel and 
pipeline processors within a network processor. The "fly by" processing style of a network 
processor permits it to look at each byte of a packet as it passes through, using registers and 
30 other alternatives to memory access. The network processor's "stateless forwarding" operation 
is best suited for tasks not involving complex calculations that require rapid updating of state 
information. 
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An appropriate internal protocol may be provided for exchanging information between 
the network interface engine 1030 and the transport engine 1050 when setting up or terminating 
a TCP and/or UDP connections and to transfer packets between the two engines. For example, 
where the distributive interconnection medium is a switch fabric, the internal protocol may be 
5 implemented as a set of messages exchanged across the switch fabric. These messages indicate 
the arrival of new inbound or outbound connections and contain inbound or outbound packets on 
existing connections, along with identifiers or tags for those connections. The internal protocol 

♦ 

may also be used to transfer identifiers or tags between the transport engine 1050 and the 
application processing engine 1070 and/or the storage processing engine 1040. These identifiers 
10 or tags may be used to reduce or strip or accelerate a portion of the protocol stack. 

For example, with a TCP/IP connection, the network interface engine 1030 may receive 
a request for a new connection. The header information associated with the initial request may 
be provided to the transport processing engine 1050 for processing. That result of this 

15 processing may be stored in the resources of the transport processing engine 1050 as state and 
management information for that particular network session. The transport processing engine 
1050 then informs the network interface engine 1030 as to the location of these results. 
Subsequent packets related to that connection that are processed by the network interface engine 
1030 may have some of the header information stripped and replaced with an identifier or tag 

20 that is provided to the transport processing engine 1050. The identifier or tag may be a pointer, 
index or any other mechanism that provides for the identification of the location in the transport 

■ 

processing engine of the previously setup state and management information (or the 
corresponding network session). In this manner, the transport processing engine 1050 does not 
have to process the header information of every packet of a connection. Rather, the transport 
25 interface engine merely receives a contextually meaningful identifier or tag that identifies the 
previous processing results for that connection. 

• w 

In one embodiment, the data link, network, transport and session layers (layers 2-5) of a 
packet may be replaced by identifier or tag information. For packets related to an established 
30 connection the transport processing engine does not have to perform intensive processing with 
regard to these layers such as hashing, scanning, look up, etc. operations. Rather, these layers 
have already been converted (or processed) once in the transport processing engine and the 
transport processing engine just receives the identifier or tag provided from the network 
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interface engine that identifies the location of the conversion results. 



In this manner an identifier or tag is provided for each packet of an established 
connection so that the more complex data computations of converting header information may 
5 be replaced with a more simplistic analysis of an identifier or tag. The delivery of content is 
thereby accelerated, as the time for packet processing and the amount of system resources for 
packet processing are both reduced. The functionality of network processors, which provide 
efficient parallel processing of packet headers, is well suited for enabling the acceleration 
described herein. In addition, acceleration is further provided as the physical size of the packets 
1 0 provided across the distributed interconnect may be reduced. 



Though described herein with reference to messaging between the network interface 
engine and the transport processing engine, the use of identifiers or tags may be utilized amongst 
all the engines in the modular pipelined processing described herein. Thus, one engine may 
15 replace packet or data information with contextually meaningful information that may require 
less processing by the next engine in the data and communication flow path. In addition, these 
techniques may be utilized for a wide variety of protocols and layers, not just the exemplary 
embodiments provided herein. 



20 With the above-described tasks being performed by the network interface engine, the 

transport engine may perform TCP sequence number processing, acknowledgement and 
retransmission, segmentation and reassembly, and flow control tasks. These tasks generally call 
for storing and modifying connection state information on each TCP and UDP connection, and 
therefore are considered more appropriate for the processing capabilities of general purpose 

25 processors. 

As will be discussed with references to alternative embodiments (such as FIGS. 2 and 
2A), the transport engine 1050 and the network interface engine 1030 may be combined into a 
single engine. Such a combination may be advantageous as communication across the switch 
30 fabric is not necessary for protocol processing. However, limitations of many commercially 
available network processors make the split protocol stack processing described above desirable. 

APPLICATION PROCESSING ENGINE 
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Application processing engine 1070 may be provided in content delivery system 1010 
for application processing, and may be, for example, any hardware or hardware/software 
subsystem suitable for session layer protocol processing (eg., HTTP, RTSP streaming, etc.) of 
content requests received from network transport processing engine 1050. In one embodiment 
5 application processing engine 1070 may be a dedicated application processing module based on 
an INTEL PENTIUM III processor running, for example, on standard x86 OS systems (eg., 
Linux, Windows NT, FreeBSD, etc.). Application processing engine 1070 may be utilized for 
dedicated application-only processing by virtue of the off-loading of all network protocol and 
storage processing elsewhere in content delivery system 1010. In one embodiment, processor 
10 programming for application processing engine 1070 may be generally similar to that of a 
conventional server, but without the tasks off-loaded to network interface processing engine 
1030, storage processing engine 1040, and transport processing engine 1050. 

STORAGE MANAGEMENT ENGINE 

Storage management engine 1040 may be any hardware or hardware/software subsystem 
suitable for effecting delivery of requested content from content sources (for example content 
sources 1090 and/or 1100) in response to processed requests received from application 
processing engine 1070. It will also be understood that in various embodiments a storage 
management engine 1040 may be employed with content sources other than disk drives (e.g., 
solid state storage, the storage systems described above, or any other media suitable for storage 
of data) and may be programmed to request and receive data from these other types of storage. 

In one embodiment, processor programming for storage management engine 1040 may 
be optimized for data retrieval using techniques such as caching, and may include and maintain a 
25 disk cache to reduce the relatively long time often required to retrieve data from content sources, 
such as disk drives. Requests received by storage management engine 1040 from application 
processing engine 1070 may contain information on how requested data is to be formatted and 
its destination, with this information being comprehensible to transport processing engine 1050 
and/or network interface processing engine 1030. The storage management engine 1040 may 
30 utilize a disk cache to reduce the relatively long time it may take to retrieve data stored in a 
storage medium such as disk drives. Upon receiving a request, storage management engine 
1040 may be programmed to first determine whether the requested data is cached, and then to 
send a request for data to the appropriate content source 1090 or 1 100. Such a request may be in 
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the form of a conventional read request. The designated content source 1090 or 1100 responds 
by sending the requested content to storage management engine 1040, which in turn sends the 
content to transport processing engine 1050 for forwarding to network interface processing 
engine 1030. 

5 

Based on the data contained in the request received from application processing engine 
1070, storage processing engine 1040 sends the requested content in proper format with the 
proper destination data included. Direct communication between storage processing engine 
1040 and transport processing engine 1050 enables application processing engine 1070 to be 
10 bypassed with the requested content. Storage processing engine 1040 may also be configured to 
write data to content sources 1090 and/or 1100 (e.g., for storage of live or broadcast streaming 
content). 

In one embodiment storage management engine 1040 may be a dedicated block-level 
15 cache processor capable of block level cache processing in support of thousands of concurrent 
multiple readers, and direct block data switching to network interface engine 1030. In this regard 
storage management engine 1040 may utilize a POWER PC 7450 processor in conjunction with 
ECC memory and a LSI SYMFC929 dual 2GBaud fibre channel controller for fibre channel 
interconnect to content sources 1090 and/or 1 100 via dual fibre channel arbitrated loop 1092. It 
20 will be recognized, however, that other forms of interconnection to storage sources suitable for 
retrieving content are also possible. Storage management engine 1040 may include hardware 
and/or software for running the Fibre Channel (FC) protocol, the SCSI (Small Computer 

* 

Systems Interface) protocol, iSCSI protocol as well as other storage networking protocols. 

25 Storage management engine 1040 may employ any suitable method for caching data, 

including simple computational caching algorithms such as random removal (RR), first-in first- 
out (FIFO), predictive read-ahead, over buffering, etc. algorithms. Other suitable caching 
algorithms include those that consider one or more factors in the manipulation of content stored 
within the cache memory, or which employ multi-level ordering, key based ordering or function 

30 based calculation for replacement. In one embodiment, storage management engine may 
implement a layered multiple LRU (LMLRU) algorithm that uses an integrated block/buffer 

» 

management structure including at least two layers of a configurable number of multiple LRU 
queues and a two-dimensional positioning algorithm for data blocks in the memory to reflect the 
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relative priorities of a data block in the memory in terms of both recency and frequency. Such a 
caching algorithm is described in further detail in concurrently filed U.S. patent application 
Serial No. 09/797,198, entitled "Systems and Methods for Management of Memory" by Qiu et. 
al, the disclosure of which is incorporated herein by reference. 

* 

5 

For increasing delivery efficiency of continuous content, such as streaming multimedia 
content, storage management engine 1040 may employ caching algorithms that consider the 
dynamic characteristics of continuous content. Suitable examples include, but are not limited to, 
interval caching algorithms. In one embodiment, improved caching performance of continuous 

10 content may be achieved using an LMLRU caching algorithm that weighs ongoing viewer cache 
value versus the dynamic time-size cost of maintaining particular content in cache memory. 
Such a caching algorithm is described in further detail in concurrently filed U.S. patent 
application Serial No. 09/797,201, entitled "Systems and Methods for Management of Memory 
in Information Delivery Environments" by Qiu et. al, the disclosure of which is incorporated 

1 5 herein by reference. 

SYSTEM MANAGEMENT ENGINE 

System management (or host) engine 1060 may be present to perform system 
management functions related to the operation of content delivery system 1010. Examples of 

20 system management functions include, but are not limited to, content provisioning/updates, 
comprehensive statistical data gathering and logging for sub-system engines, collection of 
shared user bandwidth utilization and content utilization data that may be input into billing and 
accounting systems, "on the fly" ad insertion into delivered content, customer programmable 
sub-system level quality of service ("QoS") parameters, remote management (e.g., SNMP, web- 

25 based, CLI), health monitoring, clustering controls, remote/local disaster recovery functions, 
predictive performance and capacity planning, etc. In one embodiment, content delivery 
bandwidth utilization by individual content suppliers or users (e.g. 9 individual supplier/user 
usage of distributive interchange and/or content delivery engines) may be tracked and logged by 
system management engine 1060, enabling an operator of the content delivery system 1010 to 

30 charge each content supplier or user on the basis of content volume delivered. 

System management engine 1060 may be any hardware or hardware/software subsystem 
suitable for performance of one or more such system management engines and in one 
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embodiment may be a dedicated application processing module based, for example, on an 
INTEL PENTIUM HI processor running an x86 OS. Because system management engine 1060 
is provided as a discrete modular engine, it may be employed to perform system management 
functions from within content delivery system 1010 without adversely affecting the performance 
5 of the system. Furthermore, the system management engine 1060 may maintain information on 
processing engine assignment and content delivery paths for various content delivery 
applications, substantially eliminating the need for an individual processing engine to have 
intimate knowledge of the hardware it intends to employ. 

10 Under manual or scheduled direction by a user, system management processing engine 

1060 may retrieve content from the network 1020 or from one or more external servers on a 
second network 1024 (e.g., LAN) using, for example, network file system (NFS) or common 
internet file system (CIFS) file sharing protocol. Once content is retrieved, the content delivery 
system may advantageously maintain an independent copy of the original content, and therefore 

15 is free to employ any file system structure that is beneficial, and need not understand low level 
disk formats of a large number of file systems. 

■ 

Management interface 1062 may be provided for interconnecting system management 
engine 1060 with a network 1200 (e.g., LAN), or connecting content delivery system 1010 to 

» 

20 other network appliances such as other content delivery systems 1010, servers, computers, etc. 
Management interface 1062 may be by any suitable network interface, such as 10/100 Ethernet, 
and may support communications such as management and origin traffic. Provision for one or 
more terminal management interfaces (not shown) for may also be provided, such as by RS-232 
port, etc. The management interface may be utilized as a secure port to provide system 

25 management and control information to the content delivery system 1010. For example, tasks 
which may be accomplished through the management interface 1062 include reconfiguration of 
the allocation of system hardware (as discussed below with reference to FIGS. 1CMF), 
programming the application processing engine, diagnostic testing, and any other management 
or control tasks. Though generally content is not envisioned being provided through the 

30 management interface, the identification of or location of files or systems containing content 
may be received through the management interface 1062 so that the content delivery system may 
access the content through the other higher bandwidth interfaces. 
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MANAGEMENT PERFORMED BY THE NETWORK INTEFACE 

Some of the system management functionality may also be performed directly within the 
network interface processing engine 1030. In this case some system policies and filters may be 
executed by the network interface engine 1030 in real-time at wirespeed. These polices and 
5 filters may manage some traffic / bandwidth management criteria and various service level 
guarantee policies. Examples of such system management functionality of are described below. 
It will be recognized that these functions may be performed by the system management engine 
1060, the network interface engine 1030, or a combination thereof. 

* 

10 For example, a content delivery system may contain data for two web sites. An operator 

of the content delivery system may guarantee one web site ("the higher quality site") higher 
performance or bandwidth than the other web site ("the lower quality site"), presumably in 
exchange for increased compensation from the higher quality site. The network interface 
processing engine 1030 may be utilized to determine if the bandwidth limits for the lower 

15 quality site have been exceeded and reject additional data requests related to the lower quality 
site. Alternatively, requests related to the lower quality site may be rejected to ensure the 
guaranteed performance of the higher quality site is achieved. In this manner the requests may 
be rejected immediately at the interface to the external network and additional resources of the 
content delivery system need not be utilized. In another example, storage service providers may 

20 use the content delivery system to charge content providers based on system bandwidth of 
downloads (as opposed to the traditional storage area based fees). For billing purposes, the 
network interface engine may monitor the bandwidth use related to a content provider. The 
network interface engine may also reject additional requests related to content from a content 
provider whose bandwidth limits have been exceeded. Again, in this maimer the requests may 

■ 

25 be rejected immediately at the interface to the external network and additional resources of the 
content delivery system need not be utilized. 

Additional system management functionality, such as quality of service (QoS) 
functionality, also may be performed by the network interface engine. A request from the 
30 external network to the content delivery system may seek a specific file and also may contain 
Quality of Service (QoS) parameters. In one example, the QoS parameter may indicate the 
priority of service that a client on the external network is to receive. The network interface 
engine may recognize the QoS data and the data may then be utilized when managing the data 
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and communication flow through the content delivery system. The request may be transferred to 
the storage management engine to access this file via a read queue, e.g., 
[Destination IP] [Filename] [File Type (CoS)] [Transport Priorities (QoS)]. 
All file read requests may be stored in a read queue. Based on CoS/QoS policy parameters as 
5 well as buffer status within the storage management engine (empty, full, near empty, block seq#, 
etc), the storage management engine may prioritize which blocks of which files to access from 
the disk next, and transfer this data into the buffer memory location that has been assigned to be 
transmitted to a specific IP address. Thus based upon QoS data in the request provided to the 
content delivery system, the data and communication traffic through the system may be 
10 prioritized. The QoS and other policy priorities may be applied to both incoming and outgoing 
traffic flow. Therefore a request having a higher QoS priority may be received after a lower 
order priority request, yet the higher priority request may be served data before the lower 
priority request. 

1 5 The network interface engine may also be used to filter requests that are not supported by 

the content delivery system. For example, if a content delivery system is configured only to 
accept HTTP requests, then other requests such as FTP, telnet, etc. may be rejected or filtered. 
This filtering may be applied directly at the network interface engine, for example by 
programming a network processor with the appropriate system policies. Limiting undesirable 

20 traffic directly at the network interface offloads such functions from the other processing 
modules and improves system performance by limiting the consumption of system resources by 
the undesirable traffic. It will be recognized that the filtering example described herein is 
merely exemplary and many other filter criteria or policies may be provided. 

25 MULTI-PROCESSOR MODULE DESIGN 

As illustrated in FIG. 1A, any given processing engine of content delivery system 1010 
may be optionally provided with multiple processing modules so as to enable parallel or 
redundant processing of data and/or communications. For example, two or more individual 
dedicated TCP/UDP processing modules 1050a and 1050b may be provided for transport 

30 processing engine 1050, two or more individual application processing modules 1070a and 
1070b may be provided for network application processing engine 1070, two or more individual 
network interface processing modules 1030a and 1030b may be provided for network interface 
processing engine 1030 and two or more individual storage management processing modules 
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1040a and 1040b may be provided for storage management processing engine 1040. Using such 
a configuration, a first content request may be processed between a first TCP/UDP processing 
module and a first application processing module via a first switch fabric path, at the same time 
a second content request is processed between a second TCP/UDP processing module and a 
5 second application processing module via a second switch fabric path. Such parallel processing 
capability may be employed to accelerate content delivery. 

Alternatively, or in combination with parallel processing capability, a first TCP/UDP 
processing module 1050a may be backed-up by a second TCP/UDP processing module 1050b 

10 that acts as an automatic failover spare to the first module 1050a. In those embodiments 
employing multiple-port switch fabrics, various combinations of multiple modules may be 
selected for use as desired on an individual system-need basis (e.g., as may be dictated by 
module failures and/or by anticipated or actual bottlenecks), limited only by the number of 
available ports in the fabric. This feature offers great flexibility in the operation of individual 

1 5 engines and discrete processing modules of a content delivery system, which may be translated 
into increased content delivery acceleration and reduction or substantial elimination of adverse 
effects resulting from system component failures. 

, In yet other embodiments, the processing modules may be specialized to specific 
20 applications, for example, for processing and delivering HTTP content, processing and 
delivering RTSP content, or other applications. For example, in such an embodiment an 
application processing module 1070a and storage processing module 1040a may be specially 
programmed for processing a first type of request received from a network. In the same system, 
application processing module 1070b and storage processing module 1040b may be specially 
25 programmed to handle a second type of request different from the first type. Routing of 
requests to the appropriate respective application and/or storage modules may be accomplished 
using a distributive interconnect and may be controlled by transport and/or interface processing 
modules as requests are received and processed by these modules using policies set by the 
system management engine. 

30 

Further, by employing processing modules capable of performing the function of more 
than one engine in a content delivery system, the assigned functionality of a given module may 
be changed on an as-needed basis, either manually or automatically by the system management 
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engine upon the occurrence of given parameters or conditions. This feature may be achieved, 
for example, by using similar hardware modules for different content delivery engines (e.g., by 
employing PENTIUM in based processors for both network transport processing modules and 
for application processing modules), or by using different hardware modules capable of 
5 performing the same task as another module through software programmability (e.g., by 
employing a POWER PC processor based module for storage management modules that are also 
capable of functioning as network transport modules). In this regard, a content delivery system 
may be configured so that such functionality reassignments may occur during system operation, 
at system boot-up or in both cases. Such reassignments may be effected, for example, using 
1 0 software so that in a given content delivery system every content delivery engine (or at a lower 
level, every discrete content delivery processing module) is potentially dynamically 
reconfigurable using software commands. Benefits of engine or module reassignment include 
maximizing use of hardware resources to deliver content while minimizing the need to add 
expensive hardware to a content delivery system. 

15 

Thus, the system disclosed herein allows various levels of load balancing to satisfy a 
work request. At a system hardware level, the functionality of the hardware may be assigned in 
a manner that optimizes the system performance for a given load. At the processing engine 
level, loads may be balanced between the multiple processing modules of a given processing 
20 engine to further optimize the system performance. 

! * I 

I 

4 

CLUSTERS OF SYSTEMS 

The systems described herein may also be clustered together in groups of two or more to 
provide additional processing power, storage connections, bandwidth, etc. Communication 

25 between two individual systems each configured similar to content delivery system 1010 may be 
made through network interface 1022 and/or 1023. Thus, one content delivery system could 
communicate with another content delivery system through the network 1020 and/or 1024. For 
example, a storage unit in one content delivery system could send data to a network interface 
engine of another content delivery system. As an example, these communications could be via 

30 TCP/IP protocols. Alternatively, the distributed interconnects 1080 of two content delivery 
systems 1010 may communicate directly. For example, a connection may be made directly 
between two switch fabrics, each switch fabric being the distributed interconnect 1080 of 
separate content delivery systems 1010. 
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FIGS. 1G-1J illustrate four exemplary clusters of content delivery systems 1010. It will 
be recognized that many other cluster arrangements may be utilized including more or less 
content delivery systems. As shown in FIGS. 1G-1J, each content delivery system may be 
5 configured as described above and include a distributive interconnect 1080 and a network 
interface processing engine 1030. Interfaces 1022 may connect the systems to a network 1020. 
As shown in FIG. 1G, two content delivery systems may be coupled together through the 
interface 1023 that is connected to each system's network interface processing engine 1030. 
FIG. 1H shows three systems coupled together as in FIG. 1G. The interfaces 1023 of each 
10 system may be coupled directly together as shown, may be coupled together through a network 
or may be coupled through a distributed interconnect (for example a switch fabric). 

FIG. II illustrates a cluster in which the distributed interconnects 1080 of two systems 
are directly coupled together through an interface 1500. Interface 1500 may be any 

15 communication connection, such as a copper connection, optical fiber, wireless connection, etc. 
Thus, the distributed interconnects of two or more systems may directly communicate without 
communication through the processor engines of the content delivery systems 1010. FIG. 1J 
illustrates the distributed interconnects of three systems directly communicating without first 
requiring communication through the processor engines of the content delivery systems 1010. 

20 As shown in FIG. 1J, the interfaces 1500 each communicate with each other through another 
distributed interconnect 1600. Distributed interconnect 1600 may be a switched fabric or any 
Qther distributed interconnect. 

The clustering techniques described herein may also be implemented through the use of 
25 the management interface 1062. Thus, communication between multiple content delivery 
systems 1010 also may be achieved through the management interface 1062 

EXEMPLARY DATA AND COMMUNICATION FLOW PATHS 

FIG. IB illustrates one exemplary data and communication flow path configuration 
30 among modules of one embodiment of content delivery system 1010. The flow paths shown in 
FIG. IB are just one example given to illustrate the significant improvements in data processing 
capacity and content delivery acceleration that may be realized using multiple content delivery 
engines that are individually optimized for different layers of the software stack and that are 
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distributively interconnected as disclosed herein. The illustrated embodiment of FIG. IB 
employs two network application processing modules 1070a and 1070b, and two network 
transport processing modules 1050a and 1050b that are communicatively coupled with single 
storage management processing module 1040a and single network interface processing module 
5 1030a. The storage management processing module 1040a is in turn coupled to content sources 
1090 and 1 100. In FIG. IB, inter-processor command or control flow (i.e. incoming or received 
data request) is represented by dashed lines, and delivered content data flow is represented by 
solid lines. Command and data flow between modules may be accomplished through the 
distributive interconnection 1080 (not shown), for example a switch fabric. 

10 

As shown in FIG. IB, a request for content is received and processed by network 
interface processing module 1030a and then passed on to either of network transport processing 
modules 1050a or 1050b for TCP/UDP processing, and then on to respective application 
processing modules 1070a or 1070b, depending on the transport processing module initially 

15 selected. After processing by the appropriate network application processing module, the 
request is passed on to storage management processor 1040a for processing and retrieval of the 
requested content from appropriate content sources 1090 and/or 1100. Storage management 
processing module 1040a then forwards the requested content directly to one of network 
transport processing modules 1050a or 1050b, utilizing the capability of distributive 

20 interconnection 1080 to bypass network application processing modules 1070a and 1070b. The 
requested content may then be transferred via the network interface processing module 1030a to 
the external network 1020. Benefits of bypassing the application processing modules with the 
delivered content include accelerated delivery of the requested content and offloading of 
workload from the application processing modules, each of which translate into greater 

25 processing efficiency and content delivery throughput. In this regard, throughput is generally 
measured in sustained data rates passed through the system and may be measured in bits per 
second. Capacity may be measured in terms of the number of files that may be partially cached, 
the number of TCP/IP connections per second as well as the number of concurrent TCP/IP 
connections that may be maintained or the number of simultaneous streams of a certain bit rate. 

30 In an alternative embodiment, the content may be delivered from the storage management 
processing module to the application processing module rather than bypassing the application 
processing module. This data flow may be advantageous if additional processing of the data is 

» 

desired. For example, it may be desirable to decode or encode the data prior to' delivery to the 
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network. 

To implement the desired command and content flow paths between multiple modules, 
each module may be provided with means for identification, such as a compbnent ID. 
5 Components may be affiliated with content requests and content delivery to effect a desired 
module routing. The data-request generated by the network interface engine may include 
pertinent information such as the component ID of the various modules to be utilized in 
processing the request. For example, included in the data request sent to the storage 
management engine may be the component ID of the transport engine that is designated to 
10 receive the requested content data. When the storage management engine retrieves the data 
from the storage device and is ready to send the data to the next engine, the storage management 
engine knows which component ID to send the data to. 

As further illustrated in FIG. IB, the use of two network transport modules in 
15 conjunction with two network application processing modules provides two parallel processing 
paths for network transport and network application processing, allowing simultaneous 
processing of separate content requests and simultaneous delivery of separate content through 
the parallel processing paths, further increasing throughput/capacity and accelerating content 
delivery. Any two modules of a given engine may communicate with separate modules of 
20 another engine or may communicate with the same module of another engine. This is illustrated 
in FIG. IB where the transport modules are shown to communicate with separate application 
modules and the application modules are shown to communicate with the same storage 
management module. 

25 FIG. IB illustrates only one exemplary embodiment of module and processing flow path 

configurations that may be employed using the disclosed method and system. Besides the 

■ 

embodiment illustrated in FIG. IB, it will be understood that multiple modules may be 
additionally or alternatively employed for one or more other network content delivery engines 
storage management processing engine, network interface processing engine, system 
30 management processing engine, etc.) to create other additional or alternative parallel processing 
flow paths, and that any number of modules (e.g., greater than two) may be employed for a 
given processing engine or set of processing engines so as to achieve more than two parallel 
processing flow paths. For example, in other possible embodiments, two or more different 
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network transport processing engines may pass content requests to the same application unit, or 
vice-versa. 

* 

Thus, in addition to the processing flow paths illustrated in FIG. IB, it will be 
5 understood that the disclosed distributive interconnection system may be employed to create 
other custom or optimized processing flow paths (e.g. t by bypassing and/or interconnecting any 
given number of processing engines in desired sequence/s) to fit the requirements or desired 
operability of a given content delivery application. For example, the content flow path of FIG. 
IB illustrates an exemplary application in which the content is contained in content sources 

10 1090 and/or 1 100 that are coupled to the storage processing engine 1040. However as discussed 
above with reference to FIG. 1 A, remote and/or live broadcast content may be provided to the 
content delivery system from the networks 1020 and/or 1024 via the second network interface 
connection 1023. In such a situation the content may be received by the network interface 
engine 1030 over interface connection 1023 and immediately re-broadcast over interface 

15 connection 1022 to the network 1020. Alternatively, content may be proceed through the 
network interface connection 1023 to the network transport engine 1050 prior to returning to the 
network interface engine 1030 for re-broadcast over interface connection 1022 to the network 
1020 or 1024. In yet another alternative, if the content requires some maimer of application 
processing (for example encoded content that may need to be decoded), the content may proceed 

20 all the way to the application engine 1070 for processing. After application processing the 
content may then be delivered through the network transport engine 1050, network interface 
engine 1030 to the network 1020 or 1024. 

In yet another embodiment, at least two network interface modules 1030a and 1030b 

4 

25 may be provided, as illustrated in FIG. 1 A. In this embodiment, a first network interface engine 
1030a may receive incoming data from a network and pass the data directly to the second 
network interface engine 1030b for transport back out to the same or different network. For 
example, in the remote or live broadcast application described above, first network interface 
engine 1030a may receive content, and second network interface engine 1030b provide the 

30 content to the network 1020 to fulfill requests from one or more clients for this content. Peer-to- 
peer level communication between the two network interface engines allows first network 
interface engine 1030a to send the content directly to second network interface engine 1030b via 
distributive interconnect 1080. If necessary, the content may also be routed through transport 
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processing engine 1050, or through transport processing engine 1050 and application processing 
engine 1070, in a manner described above. 

Still yet other applications may exist in which the content required to be delivered is 
5 contained both in the attached content sources 1090 or 1 100 and at other remote content sources. 
For example in a web caching application, not all content may be cached in the attached content 
. sources, but rather some data may also be cached remotely. In such an application, the data and 
communication flow may be a combination of the various flows described above for content 
provided from the content sources 1090 and 1 100 and for content provided from remote sources 
10 on the networks 1020 and/or 1024. 

The content delivery system 1010 described above is configured in a peer-to-peer 
maimer that allows the various engines and modules to communicate with each other directly as 
peers through the distributed interconnect. This is contrasted with a traditional server 

1 5 architecture in which there is a main CPU. Furthermore unlike the arbitrated bus of traditional 
servers, the distributed interconnect 1080 provides a switching means which is not arbitrated and 
allows multiple simultaneous communications between the various peers. The data and 
communication flow may by-pass unnecessary peers such as the return of data from the storage 
management processing engine 1040 directly to the network interface processing engine 1030 as 

20 described with reference to FIG. IB. 

Communications bet ween the various processor engines may be made through the use of 
a standardized internal protocol. Thus, a standardized method is provided for routing through 
the switch fabric and communicating between any two of the processor engines which operate as 
25 peers in the peer to peer environment. The standardized internal protocol provides a mechanism 
upon which the external network protocols may "ride" upon or be incorporated within. In this 
manner additional internal protocol layers relating to internal communication and data exchange 
may be added to the external protocol layers. The additional internal layers may be provided in 

» 

addition to the external layers or may replace some of the external protocol layers (for example 
30 as described above portions of the external headers may be replaced by identifiers or tags by the 
network interface engine). 

The standardized internal protocol may consist of a system of message classes, or types, 
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where the different classes can independently include fields or layers that are utilized to identify 
the destination processor engine or processor module for communication, control, or data 
messages provided to the switch fabric along with information pertinent to the corresponding 
message class. The standardized internal protocol may also include fields or layers that identify 
5 the priority that a data packet has within the content delivery system. These priority levels may 
be set by each processing engine based upon system-wide policies. Thus, some traffic within 
the content delivery system may be prioritized over other traffic and this priority level may be 
directly indicated within the internal protocol call scheme utilized to enable communications 
within the system. The prioritization helps enable the predictive traffic flow between engines 
1 0 and end-to-end through the system such that service level guarantees may be supported. 

■ 

Other internally added fields or layers may include processor engine state, system 
timestamps, specific message class identifiers for message routing across the switch fabric and 
at the receiving prociessor engine(s), system keys for secure control message exchange, flow 
1 5 control information to regulate control and data traffic flow and prevent congestion, and specific 
address tag fields that allow hardware at the receiving processor engines to move specific types 
of data directly into system memory. 

* 

In one embodiment, the internal protocol may be structured as a set, or system of 
20 messages with common system defined headers that allows all processor engines and, 
potentially, processor engine switch fabric attached hardware, to interpret and process messages 
efficiently and intelligently. This type of design allows each processing engine, and specific 
functional entities within the processor engines, to have their own specific message classes 
optimized functionally for the exchanging their specific types control and data information. 
25 Some message classes that may be employed are: System Control messages for system 
management, Network Interface to Network Transport messages, Network Transport to 
Application Interface messages, File System to Storage engine messages, Storage engine to 
Network Transport messages, etc. Some of the fields of the standardized message header may 
include message priority, message class, message class identifier (subtype), message size, 
30 message options and qualifier fields, message context identifiers or tags, etc. In addition, the 
system statistics gathering, management and control of the various engines may be performed 
across the switch fabric connected system using the messaging capabilities. 
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ft 

By providing a standardized internal protocol, overall system performance may be 
improved. In particular, communication speed between the processor engines across the switch 
fabric may be increased. Further, communications between any two processor engines may be 
enabled. The standardized protocol may also be utilized to reduce the processing loads of a 
5 given engine by reducing the amount of data that may need to be processed by a given engine. 

The internal protocol may also be optimized for a particular system application, 
providing further performance improvements. However, the standardized internal 
communication protocol may be general enough to support encapsulation of a wide range of 

10 networking and storage protocols. Further, while internal protocol may run on PCI, PCI-X, 
ATM, IB, Lightening I/O, the internal protocol is a protocol above these transport-level 
standards and is optimal for use in a switched (non-bus) environment such as a switch fabric. In 
addition, the internal protocol may be utilized to communicate devices (or peers) connected to 
the system in addition to those described herein. For example, a peer need not be a processing 

15 engine. In one example, a peer, may be an ASIC protocol converter that is coupled to the 
distributed interconnect as a peer but operates as a slave device to other master devices within 
the system. The internal protocol may also be as a protocol communicated between systems 
such as used in the clusters described above. 

20 Thus a system has been provided in which the networking / server clustering / storage 

networking has been collapsed into a single system utilizing a common low-overhead internal 
communication protocol / transport system. 

CONTENT DELIVERY ACCELERATION 
25 As described above, a wide range of techniques have been provided for accelerating 

content delivery from the content delivery system 1010 to a network. By accelerating the speed 
at which content may be delivered, a more cost effective and higher performance system may be 
provided. These techniques may be utilized separately or in various combinations. 

30 One content acceleration technique involves the use of a multi-engine system with 

dedicated engines for varying processor tasks. Each engine can perform operations 
independently and in parallel with the other engines without the other engines needing to freeze 
or halt operations. The engines do not have to compete for resources such as memory, I/O, 
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processor time, etc. but are provided with their own resources. Each engine may also be tailored 
in hardware and/or software to perform specific content delivery task, thereby providing 
increasing content delivery speeds while requiring less system resources. Further, all data, 
regardless of the flow path, gets processed in a staged pipeline fashion such that each engine 
5 continues to process its layer of functionality after forwarding data to the next engine / layer. 

Content acceleration is also obtained from the use of multiple processor modules within 
an engine. In this manner, parallelism may be achieved within a specific processing engine. 
Thus, multiple processors responding to different content requests may be operating in parallel 
10 within one engine. 

Content acceleration is also provided by utilizing the multi-engine design in a peer to 
peer environment in which each engine may communicate as a peer. Thus, the communications 
and data paths may skip unnecessary engines. For example, data may be communicated directly 
15 from the storage processing engine to the transport processing engine without have to utilize 
resources of the application processing engine. 

Acceleration of content delivery is also achieved by removing or stripping the contents 
of some protocol layers in one processing engine and replacing those layers with identifiers or 
20 tags for use with the next processor engine in the data or communications flow path, Thus, the 
processing burden placed on the subsequent engine may be reduced. In addition, the packet size 
transmitted across the distributed interconnect may be reduced. Moreover, protocol processing 
may be off-loaded from the storage and/or application processors, thus freeing those resources to 
focus on storage or application processing. 

25 

Content acceleration is also provided by using network processors in a network endpoint 

« 

* 

system. Network processors generally are specialized to perform packet analysis functions at 

intermediate network nodes, but in the content delivery system disclosed the network processors 

■ 

have been adapted for endpoint functions. Furthermore, the parallel processor configurations 
30 within a network processor allow these endpoint functions to be performed efficiently. 

In addition, content acceleration has been provided through the use of a distributed 
interconnection such as a switch fabric. A switch fabric allows for parallel communications 
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between the various engines and helps to efficiently implement some of the acceleration 
techniques described herein. 

It will be recognized that other aspects of the content delivery system 1010 also provide 
5 for accelerated delivery of content to a network connection. Further, it will be recognized that 
the techniques disclosed herein may be equally applicable to other network endpoint systems 
and even non-endpoint systems. 

■ 

EXEMPLARY HARDWARE EMBODIMENTS 

10 FIGS. 1C-1F illustrate just a few of the many multiple network content delivery engine 

configurations possible with one exemplary hardware embodiment of content delivery system 
1010. In each illustrated configuration of this hardware embodiment, content delivery system 
1010 includes processing modules that may be configured to operate as content delivery engines 
1030, 1040, 1050, 1060, and 1070 communicatively coupled via distributive interconnection 

15 1080. As shown in FIG. 1C, a single processor module may operate as the network interface 
processing engine 1030 and a single processor module may operate as the system management 
processing engine 1060. Four processor modules 1001 may be configured to operate as either 
the transport processing engine 1050 or the application processing engine 1070. Two processor 
modules 1003 may operate as either the storage processing engine 1040 or the transport 

20 processing engine 1050. The Gigabit (Gb) Ethernet front end interface 1022, system 
management interface 1062 and dual fibre channel arbitrated loop 1092 are also shown. 

As mentioned above, the distributive interconnect 1080 may be a switch fabric based 
interconnect. As shown in FIG. 1C, the interconnect may be an IBM PRIZMA-E eight/sixteen 

25 port switch fabric 1081. In an eight port mode, this switch fabric is an 8 x 3.54 Gbps fabric and 
in a sixteen port mode, this switch fabric is a 16 x 1.77 Gbps fabric. The eight/sixteen port 
switch fabric may be utilized in an eight port mode for performance optimization. The switch 
fabric 1081 may be coupled to the individual processor modules through interface converter 
circuits 1082, such as IBM UDASL switch interface circuits. The interface converter circuits 

30 1082 convert the data aligned serial link interface (DASL) to a UTOPIA (Universal Test and 
Operations PHY Interface for ATM) parallel interface. FPGAs (field programmable gate array) 
may be utilized in the processor modules as a fabric interface on the processor modules as 
shown in FIG. 1C. These fabric interfaces provide a 64/66Mhz PCI interface to the interface 
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converter circuits 1082. FIG. 4 illustrates a functional block diagram of such a fabric interface 
34. As explained below, the interface 34 provides an interface between the processor module 
bus and the UDASL switch interface converter circuit 1082. As shown in FIG. 4, at the switch 
fabric side, a physical connection interface 41 provides connectivity at the physical level to the 

■ 

5 switch fabric. An example of interface 41 is a parallel bus interface complying with the 
UTOPIA standard. In the example of FIG. 4, interface 41 is a UTOPIA 3 interface providing a 
32-bit 110 Mhz connection. However, the concepts disclosed herein are not protocol dependent 
and the switch fabric need not comply with any particular ATM or non ATM standard. 

10 Still referring to FIG. 4, SAR (segmentation and reassembly) unit 42 has appropriate 

SAR logic 42a for performing segmentation and reassembly tasks for converting messages to 

■ 

fabric cells and vice-versa as well as message classification and message class-to-queue routing, 
using memory 42b and 42c for transmit and receive queues. This permits different classes of 
messages and permits the classes to have different priority. For example, control messages can 
15 be classified separately from data messages, and given a different priority. All fabric cells and 
the associated messages may be self routing, and no out of band signaling is required. 

A special memory modification scheme permits one processor module to write directly 
into memory of another. This feature is facilitated by switch fabric interface 34 and in particular 
20 by its message classification capability. Commands and messages follow the same path through 
switch fabric interface 34, but can be differentiated from other control and data messages. In 
this manner, processes executing on processor modules can communicate directly using their 

* 

own memory spaces. 

■ 

25 Bus interface 43 permits switch fabric interface 34 to communicate with the processor of 

the processor module via the module device or I/O bus. An example of a suitable bus 
architecture is a PCL architecture, but other architectures could be used. Bus interface 43 is a 
master/target device, permitting interface 43 to write and be written to and providing appropriate 
bus control. The logic circuitry within interface 43 implements a state machine that provides the 

30 communications protocol, as well as logic for configuration and parity. 

Referring again to FIG. 1C, network processor 1032 (for example a MOTOROLA C- 
Port C-5 network processor) of the network interface processing engine 1030 may be coupled 
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directly to an interface converter circuit 1082 as shown. As mentioned above and further shown 
in FIG. 1C, the network processor 1032 also may be coupled to the network 1020 by using a 
VITESSE GbE SERDES (serializer-deserializer) device (for example the VSC7123) and an SFP 
(small form factor pluggable) optical transceiver for LC fibre connection. 

5 

The processor modules 1003 include a fibre channel (FC) controller as mentioned above 
and further shown in FIG. 1C. For example, the fibre channel controller may be the LSI 
SYMFC929 dual 2GBaud fibre channel controller. The fibre channel controller enables 
communication with the fibre channel 1092 when the processor module 1003 is utilized as a 
10 storage processing engine 1040. Also illustrated in FIGS. 1C-1F is optional adjunct processing 

■ 

unit 1300 that employs a POWER PC processor with SDRAM. The adjunct processing unit is 

shown coupled to network processor 1032 of network interface processing engine 1030 by a PCI 

i 

interface. Adjunct processing unit 1300 may be employed for monitoring system parameters 
such as temperature, fan operation, system health, etc. 

15 

As shown in FIGS. 1C-1F, each processor module of content delivery engines 1030, 
1040, 1050, 1060, and 1070 is provided with its own synchronous dynamic random access 
memory ("SDRAM") resources, enhancing the independent operating capabilities of each 

■ 

module. The memory resources may be operated as ECC (error correcting code) memory. 

20 Network interface processing engine 1030 is also provided with static random access memory 
("SRAM"). Additional memory circuits may also be utilized as will be recognized by those 
skilled in the art. For example, additional memory resources (such as synchronous SRAM and 
non-volatile FLASH and EEPROM) may be provided in conjunction with the fibre channel 
controllers. In addition, boot FLASH memory may also be provided on the of the processor 

25 modules. 

i 

The processor modules 1001 and 1003 of FIG. 1C may be configured in alternative 
maimers to implement the content delivery processing engines such as the network interface 
processing engine 1030, storage processing engine 1040, transport processing engine 1050, 
30 system management processing engine 1060, and application processing engine 1070. 
Exemplary configurations are shown in FIGS. 1D-1F,. however, it will be recognized that other 
configurations may be utilized. 
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As shown in FIG. ID, two Pentium HI based processing modules may be utilized as 
network application processing modules 1070a and 1070b of network application processing 
engine 1070. The remaining two Pentium Ill-based processing modules are shown in FIG. ID 
configured as network transport / protocol processing modules 1050a and 1050b of network 
5 transport / protocol processing engine 1050. The embodiment of FIG. ID also includes two 
POWER PC-based processor modules, configured as storage management processing modules 
1040a and 1040b of .storage management processing engine 1040. A single MOTOROLA C- 
Port C-5 based network processor is shown employed as network interface processing engine 
1030, and a single Pentium IE-based processing module is shown employed as system 
1 0 management processing engine 1 060. 

In FIG. IE, the same hardware embodiment of FIG. 1C is shown alternatively configured 
so that three Pentium IE-based processing modules function as network application processing 
modules 1070a, 1070b ahd 1070c of network application processing engine 1070, and so that the 
15 sole remaining Pentium IE-based processing module is configured as a network transport 
processing module 1050a of network transport processing engine 1050. As shown, the 
remaining processing modules are configured as in FIG. ID. 

* 

In FIG. IF,, the same hardware embodiment of FIG. 1C is shown in yet another alternate 
20 configuration so that three Pentium El-based processing modules function as application 
processing modules 1070a, 1070b and 1070c of network application processing engine 1070. In 
addition, the network transport processing engine 1050 includes one Pentium El-based 
processing module that is configured as network transport processing module 1050a, and one 
POWER PC-based processing module that is configured as network transport processing module 
25 1050b. The remaining POWER PC-based processor module is configured as storage 
management processing module 1040a of storage management processing engine 1040. 

It will be understood with benefit of this disclosure that the hardware embodiment and 
multiple engine configurations thereof illustrated in FIGS. 1C-1F are exemplary only, and that 
30 other hardware embodiments and engine configurations thereof are also possible. It will further 
be understood that in addition to changing the assignments of individual processing modules to 
particular processing engines, distributive interconnect 1080 enables the various processing flow 
paths between individual modules employed in a particular engine configuration in a manner as 

t 
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described in relation to FIG. IB. Thus, for any given hardware embodiment and processing 
engine configuration, a number of different processing flow paths may be employed so as to 
optimize system performance to suit the needs of particular system applications. 

5 SINGLE CHASSIS DESIGN 

As mentioned above, the content delivery system 1010 may be implemented within a 
single chassis, such as for example, a 2U chassis. The system may be expanded further while 
still remaining a single chassis system. In particular, utilizing a multiple processor module or 
blade arrangement connected through a distributive interconnect (for example a switch fabric) 

10 provides a system that is easily scalable. The chassis and interconnect may be configured with 
expansion slots proyided for adding additional processor modules. Additional processor 
modules may be provided to implement additional applications within the same chassis. 
Alternatively, additional processor modules may be provided to scale the bandwidth of the 
network connection. Thus, though describe with respect to a lGbps Ethernet connection to the 

15 external network, a 10 Gbps, 40 Gbps or more connection may be established by the system 
through the use of more network interface modules. Further, additional processor modules may 
be added to address a system's particular bottlenecks without having to expand all engines of the 
system. The additional modules may be added during a systems initial configuration, as an 
upgrade during system maintenance or even hot plugged during system operation. 

20 

ALTERNATIVE SYSTEMS CONFIGURATIONS 

Further, the network endpoint system techniques disclosed herein may be implemented 
in a variety of alternative configurations that incorporate some, but not necessarily all, of the 
concepts disclosed herein. For example, FIGS. 2 and 2A disclose two exemplary alternative 
25 configurations. It will be recognized, however, that many other alternative configurations may 
be utilized while still gaining the benefits of the inventions disclosed herein. 

FIG. 2 is a more generalized and functional representation of a content delivery system 
showing how such a system may be alternately configured to have one or more of the features of 
30 the content delivery system embodiments illustrated in FIGS. 1A-1F. FIG. 2 shows content 
delivery system 200 coupled to network 260 from which content requests are received and to 
which content is delivered. Content sources 265 are shown coupled to content delivery system 
200 via a content delivery flow path 263 that may be, for example, a storage area network that 



WO 02/39263 PCT/US01/46091 

41 

links multiple content sources 265. A flow path 203 may be provided to network connection 
272, for example, to couple content delivery system 200 with other network appliances, in this 
case one or more servers 201 as illustrated in FIG. 2. 

5 In FIG. 2 content delivery system 200 is configured with multiple processing and 

memory modules that are distributively interconnected by inter-process communications path 
230 and inter-process data movement path 235. Inter-process communications path 230 is 
provided for receiving and distributing inter-processor command communications between the 
modules and network 260, and interprocess data movement path 235 is provided for receiving 
10 and distributing inter-processor data among the separate modules. As illustrated in FIGS. 1A- 
IF, the functions of inter-process communications path 230 and inter-process data movement 
path 235 may be together handled by a single distributive interconnect 1080 (such as a switch 
fabric, for example), however, it is also possible to separate the communications and data paths 
as illustrated in FIG. 2, for example using other interconnect technology. 

15 

FIG. 2 illustrates a single networking subsytem processor module 205 that is provided to 
perform the combined functions of network interface processing engine 1030 and transport 
processing engine 1050 of FIG. 1A. Communication and content delivery between network 260 
and networking subsystem processor module 205 are made through network connection 270. 

20 For certain applications, the functions of network interface processing engine 1030 and transport 
processing engine 1050 of FIG. 1A may be so combined into a single module 205 of FIG. 2 in 
order to reduce the level of communication and data traffic handled by communications path 230 
and data movement path 235 (or single switch fabric), without adversely impacting the resources 
of application processing engine or subsystem module. If such a modification were made to the 

25 system of FIG. 1A, content requests may be passed directly from the combined 
interface/transport engine to network application processing engine 1070 via distributive 
interconnect 1080. Thus, as previously described the functions of two or more separate content 
delivery system engines may be combined as desired (e.g., in a single module or in multiple 
modules of a single processing blade), for example, to achieve advantages in efficiency or cost. 

30 

In the embodiment of FIG. 2, the function of network application processing engine 1070 
of FIG. 1A is performed by application processing subsytem module 225 of FIG. 2 in 
conjunction with application RAM subsystem module 220 of FIG. 2. System monitor module 
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240 communicates with server/s 201 through flow path 203 and Gb Ethernet network interface 
connection 272 as also shown in FIG. 2. The system monitor module 240 may provide the 
function of the system management engine 1060 of FIG. 1A and/or other system policy / filter 
functions such as may also be implemented in the network interface processing engine 1030 as 
5 described above with .reference to FIG. 1 A. 

Similarly, the function of network storage management engine 1040 is performed by 
storage subsytem module 210 in conjunction with file system cache subsystem module 215. 
Communication and content delivery between content sources 265 and storage subsystem 
10 module 210 are shown made directly through content delivery flowpath 263 through fibre 
channel interface connection 212. Shared resources subsystem module 255 is shown provided 
for access by each of the other subsystem modules and may include, for example, additional 
processing resources, additional memory resources such as RAM, etc. 

15 Additional processing engine capability (eg., additional system management processing 

capability, additional application processing capability, additional storage processing capability, 
encryption / decryption processing capability, compression / decompression processing 
capability, encoding / decoding capability, other processing capability, etc.) may be provided as 
desired and is represented by other subsystem module 275. Thus, as previously described the 

20 functions of a single network processing engine may be sub-divided between separate modules 
that are distributively interconnected. The sub-division of network processing engine tasks may 
also be made for reasons of efficiency or cost, and/or may be taken advantage of to allow 
resources (e.g., memory or processing) to be shared among separate modules. Further, 
additional shared resources may be made available to one or more separate modules as desired. 

25 

Also illustrated in FIG. 2 are optional monitoring agents 245 and resources 250. In the 
embodiment of FIG. 2, each monitoring agent 245 may be provided to monitor the resources 250 
of its respective processing subsystem module, and may track utilization of these resources both 
within the overall system 200 and within its respective processing subsystem module. Examples 
30 of resources that may be so monitored and tracked include, but are not limited to, processing 
engine bandwidth, Fibre Channel bandwidth, number of available drives, IOPS (input/output 
operations per second) per drive and RAID (redundant array of inexpensive discs) levels of 
storage devices, memory available for caching blocks of data, table lookup engine bandwidth, 
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availability of RAM for connection control structures and outbound network bandwidth 
availability, shared resources (such as RAM) used by streaming application on a per-stream 
basis as well as for use with connection control structures and buffers, bandwidth available for 
message passing between subsystems, bandwidth available for passing data between the various 
5 subsystems, etc. 

Information gathered by monitoring agents 245 may be employed for a wide variety of 
purposes including for billing of individual content suppliers and/or users for pro-rata use of one 
or more resources, resource use analysis and optimization, resource health alarms, etc. In 
10 addition, monitoring agents may be employed to enable the deterministic delivery of content by 
system 200 as described in concurrently filed, co-pending United States patent application Serial 
No. 09/797,200, entitled "Systems and Methods for the Deterministic Management of 
Information;' which is incorporated herein by reference. 

1 5 In operation, content delivery system 200 of FIG. 2 may be configured to wait for a 

request for content or services prior to initiating content delivery or performing a service. A 
request for content, such as a request for access to data, may include, for example, a request to 
start a video stream, a request for stored data, etc. A request for services may include, for 
example, a request for to run an application, to store a file, etc. A request for content or services 

20 may be received from a variety of sources. For example, if content delivery system 200 is 
employed as a stream server, a request for content may be received from a client system attached 
to a computer network or communication network such as the Internet. In a larger system 
environment, e.g., a- data center, a request for content or services may be received from a 
separate subcomponent or a system management processing engine, that is responsible for 

25 performance of the overall system or from a sub-component that is unable to process the current 
request. Similarly, a request for content or services may be received by a variety of components 
of the receiving system. For example, if the receiving system is a stream server, networking 
subsytem processor module 205 might receive a content request. Alternatively, if the receiving 
system is a component of a larger system, e.g., a data center, system management processing 

30 engine may be employed to receive the request. 

Upon receipt of a request for content or services, the request may be filtered by system 
monitor 240. Such filtering may serve as a screening agent to filter out requests that the 
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receiving system is not capable of processing (eg., requests for file writes from read-only 
system embodiments, unsupported protocols, content/services unavailable on system 200, etc.). 
Such requests may be rejected outright and the requestor notified, may be re-directed to a server 
201 or other content delivery system 200 capable of handling the request, or may be disposed of 
5 any other desired manner. 

Referring now in more detail to one embodiment of FIG. 2 as may be employed in a 
stream server configuration, networking processing subsystem module 205 may include the 
hardware and/or software used to run TCP/IP (Transmission Control Protocol/Internet Protocol), 

10 UDP/EP (User Datagram Protocol/Internet Protocol), RTP (Real-Time Transport Protocol), 
Internet Protocol (IP), Wireless Application Protocol (WAP) as well as other networking 
protocols. Network interface connections 270 and 272 may be considered part of networking 
subsystem processing module 205 or as separate components. Storage subsystem module 210 
may include hardware and/or software for running the Fibre Channel (FC) protocol, the SCSI 

15 (Small Computer Systems Interface) protocol, iSCSI protocol as well as other storage 
networking protocols. FC interface 212 to content delivery flowpath 263 may be considered 
part of storage subsystem module 210 or as a separate component. File system cache subsystem 
module 215 may include, in addition to cache hardware, one or more cache management 
» algorithms as well as other software routines. 

20 

Application RAM subsystem module 220 may function as a memory allocation 
subsystem and application processing subsytem module 225 may function as a stream-serving 
application processor bandwidth subsytem. Among other services, application RAM subsystem 
module 220 and application processing subsytem module 225 may be used to facilitate such 
25 services as the pulling of content from storage and/or cache, the formatting of content into RTSP 
(Real-Time Streaming Protocol) or another streaming protocol as well the passing of the 

* 

formatted content to networking subsystem 205. 

As previously described, system monitor module 240 may be included in content 
30 delivery system 200 to manage one or more of the subsystem processing modules, and may also 
be used to facilitate communication between the modules. 

In part to allow communications between the various subsystem modules of content 
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delivery system 200, inter-process communication path 230 may be included in content delivery 
system 200, and may be provided with its own monitoring agent 245. Inter-process 
communications path 230 may be a reliable protocol path employing a reliable IPC (Inter- 
process Communications) protocol. To allow data or information to be passed between the 
5 various subsystem modules of content delivery system 200, inter-process data movement path 
235 may also be included in content delivery system 200, and may be provided with its own 
monitoring agent 245. As previously described, the functions of inter-process communications 
path 230 and inter-process data movement path 235 may be together handled by a single 
distributive interconnect 1080, that may be a switch fabric configured to support the bandwidth 
10 of content being served. 

In one embodiment, access to content source 265 may be provided via a content delivery 
* flow path 263 that is a fibre channel storage area network (SAN), a switched technology. In 
addition, network connectivity may be provided at network connection 270 (e.g., to a front end 
15 network) and/or at network connection 272 (e.g., to a back end network) via switched gigabit 
Ethernet in conjunction with the switch fabric internal communication system of content 
delivery system 200. As such, that the architecture illustrated in FIGURE 2 may be generally 
characterized as equivalent to a networking system. 

■ 

One or more shared resources subsystem modules 255 may also be included in a stream 
server embodiment of content delivery system 200, for sharing by one or more of the other 
subsystem modules. Shared resources subsystem module 255 may be monitored by the 
monitoring agents 245 of each subsystem sharing the resources. The monitoring agents 245 of 
each subsystem module may also be capable of tracking usage of shared resources 255, As 
previously described, shared resources may include RAM (Random Access Memory) as well as 
other types of shared resources. 

Each monitoring agent 245 may be present to monitor one or more of the resources 250 
of its subsystem processing module as well as the utilization of those resources both within the 
overall system and within the respective subsystem processing module. For example, 
monitoring agent 245 of storage subsystem module 210 may be configured to monitor and track 
usage of such resources as processing engine bandwidth, Fibre Channel bandwidth to content 
delivery flow path 263, number of storage drives attached, number of input/output operations 
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per second (IOPS) per drive and RAID levels of storage devices that may be employed as 
content sources 265. Monitoring agent 245 of file system cache subsystem module 215 may be 
employed monitor and track usage of such resources as processing engine bandwidth and 
memory employed for caching blocks of data. Monitoring agent 245 of networking subsystem 
5 processing module 205 may be employed to monitor and track usage of such resources as 
processing engine bandwidth, table lookup engine bandwidth, RAM employed for connection 
control structures and outbound network bandwidth availability. Monitoring agent 245 of 
application processing subsystem module 225 may be employed to monitor and track usage of 
processing engine bandwidth. Monitoring agent 245 of application RAM subsystem module 

10 220 may be employed to monitor and track usage of shared resource 255, such as RAM, which 
may be employed by a streaming application on a per-stream basis as well as for use with 
connection control structures and buffers. Monitoring agent 245 of inter-process 
communication path 230 may be employed to monitor and track usage of such resources as the 
bandwidth used for message passing between subsystems while monitoring agent 245 of inter- 

15 process data movement path 235 may be employed to monitor and track usage of bandwidth 
employed for passing data between the various subsystem modules. 

The discussion concerning FIG. 2 above has generally been oriented towards a system 
designed to deliver streaming content to a network such as the Internet using, for example, Real 

20 Networks, Quick Time or Microsoft Windows Media streaming formats. However* the 
disclosed systems and methods may be deployed in any other type of system operable to deliver 
content, for example, in web serving or file serving system environments. In such 
environments, the principles may generally remain the same. However for application 
processing embodiments, some differences may exist in the protocols used to communicate and 

25 the method by which data delivery is metered (via streaming protocol, versus TCP/IP 
windowing). 

FIG. 2A illustrates an even more generalized network endpoint computing system that 
may incorporate at least some of the concepts disclosed herein. As shown in Figure 2 A, a 
30 network endpoint system 10 may be coupled to an external network 11. The external network 
11 may include a network switch or router coupled to the front end of the endpoint system 10. 
The endpoint system 1 0 may be alternatively coupled to some other intermediate network node 
of the external network. The system 10 may further include a network engine 9 coupled to an 
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interconnect medium 14. The network engine 9 may include one or more network processors. 
The interconnect medium 14 may be coupled to a plurality of processor units 13 through 
interfaces 13 a. Each processor unit 13 may optionally be couple to data storage (in the 
exemplary embodiment shown each unit is couple to data storage). More or less processor units 
5 13 may be utilized than shown in FIG. 2A. 

The network engine 9 may be a processor engine that performs all protocol stack 
processing in a single processor module or alternatively may be two processor modules (such as 
the network interface engine 1030 and transport engine 1050 described above) in which split 

10 protocol stack processing techniques are utilized. Thus, the functionality and benefits of the 
content delivery system 1010 described above may be obtained with the system 10. The 
interconnect medium 14 may be a distributive interconnection (for example a switch fabric) as 
described with reference to FIG. 1 A. All of the various computing, processing, communication, 
and control techniques described above with reference to FIGS. 1A-1F and 2 may be 

15 implemented within the system 10. It will therefore be recognized that these techniques may be 
utilized with a wide variety of hardware and computing systems and the techniques are not 
limited to the particular embodiments disclosed herein. 

The system 10 may consist of a variety of hardware configurations. In one configuration 
20 the network engine 9 may be a stand-alone device and each processing unit 13 may be a separate 
server. In another configuration the network engine 9 may be configured within the same 
chassis as the processing units 13 and each processing unit 13 may be a separate server card or 
other computing system. Thus, a network engine (for example an engine containing a network 
processor) may provide transport acceleration and be combined with multi-server functionality 
25 within the system 10. The system 10 may also include shared management and interface 
components. Alternatively, each processing unit 13 may be a processing engine such as the 
transport processing engine, application engine, storage engine, or system management engine 
of FIG. 1A. In yet another alternative, each processing unit may be a processor module (or 
processing blade) of the processor engines shown in the system of FIG. 1 A. 

30 

FIG. 2B illustrates yet another use of a network engine 9. As shown in FIG. 2B, a 
network engine 9 may be added to a network interface card 35. The network interface card 35 
may further include the interconnect medium 14 which may be similar to the distributed 
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interconnect 1080 described above. The network interface card may be part of a larger 
computing system such as a server. The network interface card may couple to the larger system 
through the interconnect medium 14. In addition to the functions described above, the network 
engine 9 may perform all traditional functions of a network interface card. 

5 

It will be recognized that all the systems described above (FIGS. 1A, 2, 2A, and 2B) 
utilize a network engine between the external network and the other processor units that are 
appropriate for the function of the particular network node. The network engine may therefore 
offload tasks from the other processors. The network engine also may perform "look ahead 
10 processing" by performing processing on a request before the request reaches whatever 
processor is to perform whatever processing is appropriate for the network node. In this manner, 
the system operations may be accelerated and resources utilized more efficiently. 

* 

It will be understood with benefit of this disclosure that although specific exemplary. 

15 embodiments of hardware and software have been described herein, other combinations of 
hardware and/or software may be employed to achieve one or more features of the disclosed 
systems and methods. Furthermore, it will be understood that operating environment and 
application code may be modified as necessary to implement one or more aspects of the 
disclosed technology, and that the disclosed systems and methods may be implemented using 

20 other hardware models as well as in environments where the application and operating system 
code may be controlled. 
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WHAT IS CLAIMED IS: 

1 . A network content delivery system, comprising: 
at least one storage processor; 

at least one application processor performing endpoint functionality processing; 
5 a system interface connection configured to be coupled to a network; 

at least one network processor, the network processor coupled to the system interface 

connection to receive data from the network; and 
an interconnection between the system processor and the network processor so that the 

network processor may analyze data provided from the network and process the 
1 0 data at least in part and then forward the data to the interconnection so that other 

processing may be performed on the data within the system, 
wherein the interconnection enables a data content delivery path between the storage 

processor and the network processor for providing data from a storage system to 

the network, the data content delivery path by-passing the application processor. 

15 

2. The system of claim 1, wherein the system further comprises an additional processor, the 
additional processor provided in the data content delivery path between the storage processor 
and the network processor. 

20 3 . The system of claim 1 , wherein the application processor is in a content request path 

* 

between the network processor and the storage processor. 

4. The system of claim 1 , wherein the storage processor, application processor and the 

* 

network processor are configured as an asymmetric multi-processor staged pipelined system. 

25 

5. The system of claim 1, wherein the storage processor, application processor and the 
network processor communicate in a peer to peer environment. 

6. The system of claim 5, wherein the application processor is in a content request path 
30 between the network processor and the storage processor. 
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7. The system of claim 6, wherein the system further comprises an additional processor, the 
additional processor provided in the data content delivery path between the storage processor 

4 

and the network processor. 

5 8. The system of claim 7, wherein the additional processor is also in the content request 
path between the network processor and the storage processor. 

9. The system of claim 8, wherein the interconnection comprises a distributed 
interconnection. 

10 

10. The system of claim 9, wherein the distributed interconnection comprises a switch fabric. 

* 

11.. The system of claim 5, wherein the interconnection comprises a distributed 
interconnection. 

15 

12. The system of claim 1 1, wherein the distributed interconnection comprises a switch 

■ 

fabric. 

■ 

13. The system of claim 1, wherein the interconnection comprises a switch fabric. 

20 

14. A method of operating a content delivery system, the method comprising: 

providing a network processor within the content delivery system, the network processor 
being configured to be coupled to an interface which couples the content delivery 
system to a network; 

25 processing data passing through the interface with the network processor; and 

forwarding data from the network processor to a distributed interconnection; 
coupling a storage processor to the distributed interconnection, the storage processor 

configured to also be coupled to a storage system; 
coupling an application processor to the distributed interconnection, the application 
30 processor performing at least some processing upon incoming network data; and 

providing outgoing content through an outgoing content data path from the storage 

processor to the network processor, the outgoing content data path by-passing the 
application processor. 
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1 5. The method of claim 1 4, wherein the network processor analyzes headers of data packets 
transmitted to the network endpoint system from the network. 

5 1 6, The method of claim 1 5, the method further comprising configuring the network 
processor, the storage processor and the application processor in a peer to peer computing 
environment. 

17. The method of claim 16, the network processor, the storage processor and the application 
10 processor configured as an asymmetric multi-processor staged pipeline system. 

1 8. The method of claim 1 7, the method further comprising operating the network endpoint 
system in a staged pipeline processing manner. 

15 19. The method of claim 14, wherein the method further comprises providing an additional 
processor in the outgoing content data path between the storage processor and the network 
processor. 

20. The method of claim 19, wherein the method further comprises providing the application 
20 processor in a content request path between the network processor and the storage processor. 

■ 

21 . The method of claim 14, wherein the method further comprises providing the application 
processor in a content request path between the network processor and the storage processor. 

25 22. The method of claim 14, the method further comprising configuring the network 
processor, the storage processor and the application processor in a peer to peer computing 
environment. 

23. The method of claim 14, wherein the network endpoint system comprises a plurality of 
30 system processors, the method further comprising configuring the network processor and the 
plurality of system processors in a peer to peer computing environment. 
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24. The method of claim 23, wherein the method further comprises providing the application 
processor in a content request path between the network processor and the storage processor. 

25. The method of claim 24, wherein the method further comprises providing an additional 
5 processor in the outgoing content data path between the storage processor and the network 

processor. 

26. The method of claim 25, wherein the method further comprises providing the additional 
processor in the content request path between the storage processor and the network processor. 

10 

27. The method of claim 14, the data forwarded by the network processor being forwarded 
through a switch fabric. 

28. A network connectable content delivery system, comprising: 
a first processor engine; 

a second processor engine, the second processor engine being assigned types of tasks 

different from the types of tasks assigned to the first processor engine; 
a storage processor engine, the storage processor engine being assigned types of tasks 
that are different from the types of tasks assigned to the first and second 
processor engines, the storage processor engine being configured to be coupled to 
a content storage system; 
a distributed interconnection coupled to the first, second and storage processor engines, 
. the tasks of the first, second and storage processor engines being assigned such 
that the system operates in staged pipeline manner through the distributed 
interconnection; 

a content request path, the content request path including the first, second and storage 

processor engines; and 
a content delivery path, the content delivery path by-passing the second processor 
engine. 

29. The system of claim 28, wherein the system further comprises a third processor engine, 
the third processor engine provided in the content delivery path between the storage processor 
engine and the network processor engine. 



15 



20 



25 



30 



WO 02/39263 



53 



PCT/USO 1/46091 



30. The system of claim 29, wherein Ihe third processor engine is also provided in the 
content request path. 

5 31. The system of claim 28, wherein the system further comprises a third processor engine, 
the third processor engine provided in the content request path. 

32. The system of claim 3 1 , wherein the third processor engine is a transport/protocol 
processor engine. 

10 

33. The system of claim 28, wherein the system is a network endpoint system. 

34. The system of claim 28, wherein the first processor engine is a network interface engine 
comprising a network processor, 

15 

35. The system of claim 34, wherein the second processor engine is an application processor 
engine. 

■ 

36. The system of claim 35, wherein the system further comprises a third processor engine, 
20 the third processor engine provided in the content delivery path between the storage processor 

engine and the network processor engine. 

37. The system of claim 36, wherein the third processor engine is also provided in the 
content request path. 

25 

38. The system of claim 35, wherein the system further comprises a third processor engine, 
the third processor engine provided in the content request path. 

39. The system of claim 35, wherein at least one of the network interface engine, the 
30 application processor engine or the storage processor engine comprises multiple processor 

modules operating in parallel. 
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40. The system of claim 39, wherein the application processor engine comprises multiple 
processor modules operating in parallel and the storage processor engine comprises multiple 
processor modules operating in parallel. 

5 41 . The system of claim 40, wherein the network interface processor engine, the application 
processor engine, and the storage processor engine communicate in a peer to peer fashion. 

■ 

42. The system of claim 41 , wherein the distributed interconnect is a switch fabric. 

10 . 43. The system of claim 42, wherein the system further comprises a third processor engine, 
the third processor engine provided in the content delivery path between the storage processor 
engine and the network processor engine. 

44. The system of claim 43, wherein the third processor engine is also provided in the 
1 5 content request path. 

45. The system of claim 42, wherein the system further comprises a third processor engine, 
the third processor engine provided in the content request path. ' 

20 46. The system of claim 28, wherein the distributed interconnect is a switch fabric. 

47. The system of claim 46, wherein at least one of the network interface processor engine, 
the application processor engine, or the storage processor engine comprises multiple processor 
modules operating in parallel. 

25 

48. The system of claim 47, wherein the application processor engine comprises multiple 
processor modules operating in parallel and the storage processor engine comprises multiple 
processor modules operating in parallel. 

30 49. The system of claim 47, wherein the network interface processor engine comprising a 
network processor. 
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50. The system of claim 49, wherein the network interface processor engine, the application 
processor engine, and the storage processor engine communicate in a peer to peer fashion. 

51. A method of providing content from a content delivery system to a network, comprising: 
5 providing a first processor engine; 

providing a second processor engine; . 

providing a storage processor engine, the storage processor engine being configured to 

be coupled to a content storage system; 
providing a distributed interconnection coupled to the first, second and third processor 
10 engines, the tasks of the first, second and third processor engines being assigned 

such that the system operates in staged pipeline maimer through the distributed 

interconnection; 

routing content requests through a request path that includes the first, second and storage 
processor engines; and 

1 5 routing content to be delivered to the network through delivery path that by-passes the 

* 

second processor engine. 



20 



52. The method of claim 5 1 , wherein the method further comprises providing a third 
processor engine in the delivery path. 



53 . The method of claim 52, wherein the third processor engine performs at least some 
protocol processing. 



54. The method of claim 52, wherein the method further comprises providing the third 

» 

25 processor engine in the request path. 

55 . The method of claim 5 1 , wherein the method further comprises providing a third 
processor engine in the request path. 

30 56. The method of claim 5 1 , wherein the first processor engine is a network interface 
processor engine and the second processor engine is an application processor engine. 
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* 

57. The method of claim 56, wherein the network interface processor engine comprises a 
network processor. 

58. The method of claim 57, wherein the method further comprises providing a third 
5 processor engine in the delivery path. 

59. The method of claim 58, wherein the third processor engine performs at least some 
protocol processing. 

1 0 60. The method of claim 58, wherein the method further comprises providing the third 
processor engine in the request path. 

61 . The method of claim 57, wherein the method further comprises providing a third 
processor engine in the request path. 

15 

62. The method of claim 57, wherein the distributed interconnection is a switch fabric. 
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